The stock market is a complex system with several attributes of interest for data science: dynamics, correlations and stochastic behaviour. Traditionally, forecasting stock prices relies on summary statistics such as MACD, RSI, or moving averages. With some trading knowledge, we can combine several indicators to estimate where the price action is heading. Newer methodologies, however, allow automated forecasting directly from stock price data. For example, long short-term memory (LSTM) neural networks enable time series forecasting with deep learning. The first step, however, is to collect data, and a lot of it.
Alpha Vantage is a popular source for downloading stock price data; however, I found that it is rather unreliable, since it often imposes limits on the amount of data one can download at a time. I recently came across BatchGetSymbols, an R library that enables automated retrieval of stock market data in a fast, reliable and efficient way.
#install.packages("BatchGetSymbols")
library(BatchGetSymbols)
library(ggplot2)
first.date <- '2018-01-01'
last.date <- Sys.Date()
freq.data <- 'daily'
# Use Yahoo tickers
symbols <- c('MSFT', 'GOOGL', 'NFLX', 'NVDA', 'FB', 'SHOP', 'AMZN', 'ZM', 'SQ', 'SE', 'SPOT')
stock.prices <- BatchGetSymbols(symbols, first.date = first.date, last.date = last.date, freq.data = freq.data)
# The first element of the returned list is the metadata of the download; the second holds the price data
The code above retrieves data for a few of the main tech stocks over the last few years. All we need is the list of tickers and the date range. The closing prices can then be plotted by ticker:
stock.prices[[2]] %>%
  ggplot(aes(x = ref.date, y = price.close, color = ticker)) +
  geom_line() +
  theme_classic() +
  theme(text = element_text(size = 20))
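Once the prices are in a long data frame like this, the indicators mentioned in the introduction can be computed per ticker. The following is a minimal sketch of a trailing simple moving average, not part of BatchGetSymbols itself. It runs on a made-up toy data frame that only borrows the column names `ticker`, `ref.date` and `price.close` from the plot above; the helper `sma()` and the column `sma3` are hypothetical names introduced for illustration.

```r
# Toy data with the same column names as the price data; the values are made up.
toy.prices <- data.frame(
  ticker      = rep(c("AAA", "BBB"), each = 5),
  ref.date    = rep(as.Date("2020-01-01") + 0:4, times = 2),
  price.close = c(10, 11, 12, 13, 14, 20, 22, 21, 23, 25)
)

# Hypothetical helper: trailing simple moving average over the last n observations.
# stats::filter is called explicitly because dplyr masks filter() when loaded.
# The first n - 1 values are NA because the window is not yet complete.
sma <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Apply the moving average within each ticker, keeping the original row order.
toy.prices$sma3 <- ave(toy.prices$price.close, toy.prices$ticker,
                       FUN = function(x) sma(x, 3))

print(toy.prices)
```

The same call should work on the downloaded data by swapping `toy.prices` for `stock.prices[[2]]` and picking a longer window, assuming the rows are ordered by date within each ticker.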
Here is the metadata for the download:
stock.prices[[1]]
## # A tibble: 11 x 6
##    ticker src   download.status total.obs perc.benchmark.… threshold.decis…
##    <chr>  <chr> <chr>               <int>            <dbl> <chr>
##  1 MSFT   yahoo OK                    733            1     KEEP
##  2 GOOGL  yahoo OK                    733            1     KEEP
##  3 NFLX   yahoo OK                    733            1     KEEP
##  4 NVDA   yahoo OK                    733            1     KEEP
##  5 FB     yahoo OK                    733            1     KEEP
##  6 SHOP   yahoo OK                    733            1     KEEP
##  7 AMZN   yahoo OK                    733            1     KEEP
##  8 ZM     yahoo OK                    408            0.557 OUT
##  9 SQ     yahoo OK                    733            1     KEEP
## 10 SE     yahoo OK                    733            1     KEEP
## 11 SPOT   yahoo OK                    671            0.915 KEEP
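Note that ZM is marked OUT while SPOT is kept, even though both have fewer observations than the other tickers. As far as I understand, BatchGetSymbols compares each ticker's share of the benchmark's trading dates against its `thresh.bad.data` argument, which defaults to 0.75. The sketch below reproduces that decision by hand for three rows of the table. It assumes the benchmark has 733 observations (as implied by the tickers with a ratio of 1); the columns `benchmark.obs`, `perc` and `decision` are hypothetical names, and the package's internal logic may differ in detail.

```r
# Values copied from three rows of the metadata table above.
control <- data.frame(
  ticker    = c("MSFT", "ZM", "SPOT"),
  total.obs = c(733, 408, 671)
)

# Assumption: the benchmark series has 733 observations over the same period.
control$benchmark.obs <- 733

# Assumed default of the thresh.bad.data argument of BatchGetSymbols().
thresh.bad.data <- 0.75

# Share of benchmark dates covered by each ticker, and the resulting decision.
control$perc <- control$total.obs / control$benchmark.obs
control$decision <- ifelse(control$perc >= thresh.bad.data, "KEEP", "OUT")

print(control)
```

Under these assumptions ZM covers about 56% of the benchmark dates and falls below the threshold, while SPOT at about 92% stays above it. Passing a lower value, for example `thresh.bad.data = 0.5`, to `BatchGetSymbols()` should keep ZM in the price data.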